Multi-view Chinese Treebanking

نویسندگان

  • Likun Qiu
  • Yue Zhang
  • Peng Jin
  • Houfeng Wang
چکیده

We present a multi-view annotation framework for Chinese treebanking, which uses dependency structures as the base view and supports conversion into phrase structures with minimal loss of information. A multi-view Chinese treebank was built under the proposed framework, and the first release (PMT 1.0) containing 14,463 sentences is be made freely available. To verify the effectiveness of the multi-view framework, we implemented an arc-standard transition-based dependency parser and added phrase structure features produced by the phrase structure view. Experimental results show the effectiveness of additional features for dependency parsing. Further, experiments on dependency-to-string machine translation show that our treebank and parser could achieve similar results compared to the Stanford Parser trained on CTB 7.0.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Extending and Scaling up the Chinese Treebank Annotation

We discuss on-going efforts to scale up the Chinese Treebank annotation and extending Chinese treebanking to informal genres like conversational speech, news groups and weblogs, as well as discussion forums. The original Chinese Treebank annotation scheme was designed for formal genres such as newswire and magazine articles, where the language is very formal and each document is carefully edite...

متن کامل

A Chinese Dependency Syntax for Treebanking

This paper presents a Chinese dependency syntax for treebanking. The syntax contains 13 word classes and 34 dependency types. A format of treebank based on the syntax is also proposed for the applications of computational and general linguistic research. Some experiments show that the treebank based on the proposed dependency syntax can be used for training and evaluating the dependency parser ...

متن کامل

Treebank of Chinese Bible Translations

This paper reports on a treebanking project where eight different modern Chinese translations of the Bible are syntactically analyzed. The trees are created through dynamic treebanking which uses a parser to produce the trees. The trees have been going through manual checking, but corrections are made not by editing the tree files but by re-generating the trees with an updated grammar and dicti...

متن کامل

A Semantics Oriented Grammar for Chinese Treebanking

Chinese grammar engineering has been a much debated task. Whilst semantic information has been reconed crucial for Chinese syntactic analysis and downstream applications, existing Chinese treebanks lack a consistent and strict sentential semantic formalism. In this paper, we introduce a semantics oriented grammar for Chinese, designed to provide basic supports for tasks such as automatic semant...

متن کامل

Using Treebanking Discriminants as Parse Disambiguation Features

This paper presents a novel approach of incorporating fine-grained treebanking decisions made by human annotators as discriminative features for automatic parse disambiguation. To our best knowledge, this is the first work that exploits treebanking decisions for this task. The advantage of this approach is that use of human judgements is made. The paper presents comparative analyses of the perf...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2014